
RNABindRPlus: A Predictor that Combines Machine Learning and Sequence Homology-Based Methods to Improve the Reliability of Predicted RNA-Binding Residues in Proteins



Abstract

Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. 
An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
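The abstract reports performance as the Matthews correlation coefficient (MCC) and refers to the tradeoff between precision and recall. As a minimal sketch of how these metrics are computed from per-residue binary labels (1 = RNA-binding, 0 = non-binding), the following may help; the function names and the toy labels are illustrative assumptions and are not taken from the paper or its webserver.

```python
import math


def confusion_counts(true_labels, predicted_labels):
    """Return (tp, fp, tn, fn) for per-residue binary labels (1 = RNA-binding)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 here when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def precision_recall(tp, fp, fn):
    """Return (precision, recall), using 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: eight residues of one protein.
true_labels = [1, 1, 0, 0, 1, 0, 0, 0]
predicted_labels = [1, 0, 0, 0, 1, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(true_labels, predicted_labels)
print("MCC:", mcc(tp, fp, tn, fn))
print("precision, recall:", precision_recall(tp, fp, fn))
```

MCC ranges from -1 to 1 and takes all four confusion-matrix cells into account, which makes it informative when binding residues are a small minority of all residues. For a predictor that outputs a score per residue, varying the score threshold used to assign label 1 trades precision against recall.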
